{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# AlexNet  \n",
    "**网络结构示意图**：<div align=\"center\"><img src=\"https://s2.loli.net/2023/10/09/uozp8yhJYCfBs71.png\" alt=\"202310091108093\" style=\"zoom:100%;\"/></div>  \n",
    "\n",
    "This figure shows the authors' training setup split across two GPUs, so the channel counts must be doubled when mapping the network onto a single GPU."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Key contributions of the paper:**  \n",
    "\n",
    "1. To reduce overfitting in the **fully connected layers**, **AlexNet** adds ```dropout```: during training, neurons in the fully connected layers are randomly ignored with some probability, so the network cannot rely on any single neuron. (The original paper trains on ```ILSVRC-2010```, a dataset with 1000 classes.)  \n",
    "\n",
    "2. **Parallel training across GPUs.** The authors used ```GTX 580``` cards with only 3 GB of memory each, which is far too little for a dataset and network of this size. Their solution is model parallelism: the kernels are split between two GPUs, which exchange activations only at certain layers. They describe it as:  \n",
    "> The parallelization scheme that we employ essentially puts half of the kernels (or neurons) on each GPU, with one additional trick: the GPUs communicate only in certain layers. This means that, for example, the kernels of layer 3 take input from all kernel maps in layer 2. However, kernels in layer 4 take input only from those kernel maps in layer 3 which reside on the same GPU.\n",
    "\n",
    "3. **Overlapping pooling**  \n",
    "The paper sets the pooling layers with stride $s=2$ and kernel size $z=3$, i.e. $s<z$, so adjacent pooling windows overlap; the authors report that this slightly improves accuracy and makes the model **harder to overfit**. Intuitively, a pooling layer picks a 'representative' value from each window: with $s=z$ every region contributes to exactly one window and is never revisited, while with $s<z$ each region is covered by several overlapping windows, so less spatial information is discarded and the model generalizes better.  \n",
    "> https://qr.ae/pKG1jG\n",
    "\n",
    "\n",
    "4. **Local Response Normalization (LRN)**, computed as:\n",
    "$$\n",
    "b_{x,y}^{i}= a_{x,y}^{i}/(k+ \\alpha \\sum_{j=max(0,i-n/2)}^{min(N-1,i+n/2)}(a_{x,y}^{j})^2)^\\beta\n",
    "$$  \n",
    "$a_{x,y}^{i}$: the output of the $i$-th kernel at position $(x, y)$ of the feature map (after ReLU)  \n",
    "$b_{x,y}^{i}$: the output after local response normalization  \n",
    "$N$: the total number of kernels in the layer  \n",
    "$n$: the size of the normalization neighborhood across adjacent kernels; the authors use $n=5$  \n",
    "$k,\\alpha, \\beta$: tunable hyperparameters; the authors use $k=2,\\alpha =10^{-4}, \\beta= 0.75$  \n",
    "A worked example:  \n",
    "<div align=\"center\"><img src=\"https://miro.medium.com/v2/resize:fit:1400/format:webp/1*DmnOhSTIzn04sC0w1d3FPg.png\" alt=\"LRN worked example\" style=\"zoom:40%;\"/></div>  \n",
    "\n",
    "Take the number of kernels to be $N=4$ and set $k=0, \\alpha= 1, \\beta= 1, n=2$, so LRN reduces to $b_{x,y}^{i}= a_{x,y}^{i}/\\sum_{j=i-1}^{i+1}(a_{x,y}^{j})^2$. With the values from the figure, $a_{0,0}^{0}=1, a_{0,0}^{1}=1, a_{0,0}^{2}=2$, we get $b_{0,0}^{0}= 1/(1^2+ 1^2)= 1/2$ and $b_{0,0}^{1}= 1/(1^2+ 1^2+ 2^2)= 1/6$.\n"
   ]
  },
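  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To sanity-check the formula, here is a minimal sketch (not from the paper) that implements the paper's LRN sum directly and reproduces the worked example above. The helper name `lrn_paper` and the toy input values are assumptions for illustration; note that PyTorch's built-in `nn.LocalResponseNorm` uses a slightly different convention, dividing `alpha` by the window size.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "\n",
    "def lrn_paper(a, k=0.0, alpha=1.0, beta=1.0, n=2):\n",
    "    # a has shape (N, H, W): N kernel maps of size HxW\n",
    "    # paper formula: b_i = a_i / (k + alpha * sum_{j=i-n//2}^{i+n//2} a_j**2)**beta\n",
    "    N = a.shape[0]\n",
    "    half = n // 2\n",
    "    b = torch.empty_like(a)\n",
    "    for i in range(N):\n",
    "        lo, hi = max(0, i - half), min(N - 1, i + half)\n",
    "        denom = (k + alpha * (a[lo:hi + 1] ** 2).sum(dim=0)) ** beta\n",
    "        b[i] = a[i] / denom\n",
    "    return b\n",
    "\n",
    "# 4 kernels with 1x1 feature maps, matching the example: a_0=1, a_1=1, a_2=2\n",
    "a = torch.tensor([[[1.0]], [[1.0]], [[2.0]], [[3.0]]])\n",
    "b = lrn_paper(a)\n",
    "print(b[0, 0, 0].item(), b[1, 0, 0].item())  # 0.5 and 1/6"
   ]
  },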
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Dataset used in this notebook: http://download.tensorflow.org/example_images/flower_photos.tgz    \n",
    "It contains images of 5 flower species."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "cpu\n"
     ]
    }
   ],
   "source": [
    "import torch\n",
    "import torch.nn as nn\n",
    "import os\n",
    "from shutil import copy\n",
    "import random\n",
    "from PIL import Image\n",
    "import matplotlib.pyplot as plt\n",
    "from torchvision import transforms\n",
    "from torch.utils.data import DataLoader\n",
    "import torch.optim as optim\n",
    "import torchvision\n",
    "os.environ[\"KMP_DUPLICATE_LIB_OK\"]=\"TRUE\"\n",
    "\n",
    "device = ('cuda' if torch.cuda.is_available() else 'cpu')\n",
    "print(device)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "# split the dataset into train and test folders\n",
    "class File_Split():\n",
    "    \"\"\"\n",
    "    Split the flower dataset into a training set and a test set (9:1)\n",
    "\n",
    "    Input: a root folder with one sub-folder per flower class\n",
    "    Output: ../data/flower_data/train and ../data/flower_data/test folders\n",
    "    \"\"\"\n",
    "    def __init__(self, file):\n",
    "        self.file = file\n",
    "\n",
    "    def make_file(self, file_path):\n",
    "        \"\"\"\n",
    "        Create the train/test directory if it does not already exist\n",
    "        \"\"\"\n",
    "        if not os.path.exists(file_path):\n",
    "            os.makedirs(file_path)\n",
    "\n",
    "    def file_split(self):\n",
    "        \"\"\"\n",
    "        Copy every image into the train or test folder of its class\n",
    "        \"\"\"\n",
    "        flower_label = [cla for cla in os.listdir(self.file)] # names of all flower classes\n",
    "        # create the training-set folders\n",
    "        for label in flower_label:\n",
    "            self.make_file('../data/flower_data/train/'+ label)\n",
    "        # create the test-set folders\n",
    "        for label in flower_label:\n",
    "            self.make_file('../data/flower_data/test/'+ label)\n",
    "        split_rate = 0.1 # fraction of each class held out for testing\n",
    "\n",
    "        for label in flower_label:\n",
    "            path = self.file+ '/'+ label+ '/'\n",
    "            image_list = os.listdir(path) # all image file names of this class\n",
    "            eval_index = random.sample(image_list, k= int(len(image_list)* split_rate)) # file names assigned to the test set\n",
    "            for image in image_list:\n",
    "                if image in eval_index:\n",
    "                    # copy to the test set\n",
    "                    image_path = path+ image\n",
    "                    new_path = '../data/flower_data/test/'+ label\n",
    "                    copy(image_path, new_path)\n",
    "            for image in image_list:\n",
    "                if image not in eval_index:\n",
    "                    # copy to the training set\n",
    "                    image_path = path+ image\n",
    "                    new_path = '../data/flower_data/train/'+ label\n",
    "                    copy(image_path, new_path)\n",
    "        print('Done!!')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# file = '../data/flower_photos'\n",
    "# fs = File_Split(file= file)\n",
    "# fs.file_split()\n",
    "\n",
    "# compute the per-channel mean and std of an image dataset\n",
    "def getMeanAndBias(dataset):\n",
    "    '''\n",
    "    Input:\n",
    "        dataset: a torch.utils.data.Dataset whose samples are (image tensor, label)\n",
    "    Returns:\n",
    "        [means, stds]: per-RGB-channel mean and standard deviation, averaged over all images\n",
    "    '''\n",
    "    # batch_size=1 so the statistics are accumulated image by image\n",
    "    data_iter = torch.utils.data.DataLoader(dataset, batch_size=1, shuffle=False, num_workers=0)\n",
    "\n",
    "    means = torch.zeros(3)\n",
    "    bias = torch.zeros(3)\n",
    "\n",
    "    for img, _ in data_iter:\n",
    "        for d in range(3):\n",
    "            # accumulate the mean and std of this single channel\n",
    "            means[d] += img[:, d, :, :].mean()\n",
    "            bias[d] += img[:, d, :, :].std()\n",
    "    \n",
    "    means = means / len(data_iter)\n",
    "    bias = bias / len(data_iter)\n",
    "    return [means, bias]\n",
    "\n",
    "# build the AlexNet network\n",
    "class AlexNet(nn.Module):\n",
    "    def __init__(self, num_label= 5):\n",
    "        super(AlexNet, self).__init__()\n",
    "        self.layer1 = nn.Sequential(\n",
    "            nn.Conv2d(3, 96, kernel_size= 11, stride=4, padding=2), # 3x224x224 -- 96x55x55 \n",
    "            nn.ReLU(inplace=True),\n",
    "            nn.MaxPool2d(kernel_size=3, stride=2), # 96x55x55 -- 96x27x27\n",
    "        ) # 96x27x27\n",
    "        self.norm1 = nn.LocalResponseNorm(size=5, alpha=1e-4, beta= 0.75, k=2)\n",
    "        self.layer2 = nn.Sequential(\n",
    "            nn.Conv2d(96, 256, kernel_size= 5, padding= 2),  # 256x27x27\n",
    "            nn.ReLU(inplace=True),\n",
    "            nn.MaxPool2d(kernel_size=3, stride=2), # 256x13x13\n",
    "        ) # 256x13x13\n",
    "        self.norm2 = nn.LocalResponseNorm(size=5, alpha=1e-4, beta= 0.75, k=2)\n",
    "        self.layer3 = nn.Sequential(\n",
    "            # no pooling between conv layers 3, 4 and 5;\n",
    "            # their <H,W> stays 13x13, so stride=1 and padding=1 are chosen accordingly\n",
    "            nn.Conv2d(256, 384, kernel_size=3, stride=1, padding=1), # 384x13x13\n",
    "            nn.ReLU(inplace=True),\n",
    "            nn.Conv2d(384, 384, kernel_size=3, stride=1, padding=1), # 384x13x13\n",
    "            nn.ReLU(inplace=True),\n",
    "            nn.Conv2d(384, 256, kernel_size=3, stride=1, padding=1), # 256x13x13\n",
    "            nn.ReLU(inplace=True),\n",
    "            nn.MaxPool2d(kernel_size=3, stride=2) # 256x6x6\n",
    "        )\n",
    "        self.fc = nn.Sequential(\n",
    "            nn.Linear(256*6*6, 4096),\n",
    "            nn.ReLU(inplace=True),\n",
    "            nn.Dropout(p=0.5, inplace= False),\n",
    "            nn.Linear(4096, 4096),\n",
    "            nn.ReLU(inplace=True),\n",
    "            nn.Dropout(p=0.5, inplace= False),\n",
    "            nn.Linear(4096, num_label)\n",
    "        )\n",
    "    def forward(self, x):\n",
    "        x = self.norm1(self.layer1(x)) # apply the LRN layers defined above\n",
    "        x = self.norm2(self.layer2(x))\n",
    "        x = self.layer3(x)\n",
    "        x = x.view(-1, 6*6*256)\n",
    "        x = self.fc(x)\n",
    "        return x\n",
    "\n",
    "model = AlexNet().to(device)"
   ]
  },
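  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick illustration of the overlapping-pooling point from the introduction, the sketch below (an illustration added here, not part of the original paper's code) compares ```MaxPool2d``` with $s<z$ against $s=z$ on a 55x55 input, the output size of conv1 above. Both happen to produce 27x27 outputs, but with stride 2 and kernel 3 adjacent windows share one row/column of pixels.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "import torch.nn as nn\n",
    "\n",
    "x = torch.randn(1, 96, 55, 55)                      # shape after conv1 in AlexNet\n",
    "overlap = nn.MaxPool2d(kernel_size=3, stride=2)     # s < z: overlapping pooling (AlexNet)\n",
    "non_overlap = nn.MaxPool2d(kernel_size=2, stride=2) # s = z: traditional pooling\n",
    "# output size: floor((H - kernel_size) / stride) + 1\n",
    "print(overlap(x).shape)      # torch.Size([1, 96, 27, 27])\n",
    "print(non_overlap(x).shape)  # torch.Size([1, 96, 27, 27])"
   ]
  },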
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "torch.Size([1, 5])"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = torch.randn((1, 3, 224, 224), device= device) # one dummy RGB image\n",
    "y = model(x)\n",
    "y.size()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 214,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch:1\n",
      "\n"
     ]
    },
    {
     "ename": "OutOfMemoryError",
     "evalue": "CUDA out of memory. Tried to allocate 144.00 MiB (GPU 0; 2.00 GiB total capacity; 1.56 GiB already allocated; 0 bytes free; 1.59 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mOutOfMemoryError\u001b[0m                          Traceback (most recent call last)",
      "\u001b[1;32m~\\AppData\\Local\\Temp\\ipykernel_4792\\3125045804.py\u001b[0m in \u001b[0;36m<module>\u001b[1;34m\u001b[0m\n\u001b[0;32m     68\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m     69\u001b[0m         \u001b[0moptimizer\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mzero_grad\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 70\u001b[1;33m         \u001b[0mloss\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mbackward\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m     71\u001b[0m         \u001b[0moptimizer\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mstep\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m     72\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32me:\\Anaconda\\lib\\site-packages\\torch\\_tensor.py\u001b[0m in \u001b[0;36mbackward\u001b[1;34m(self, gradient, retain_graph, create_graph, inputs)\u001b[0m\n\u001b[0;32m    486\u001b[0m                 \u001b[0minputs\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0minputs\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    487\u001b[0m             )\n\u001b[1;32m--> 488\u001b[1;33m         torch.autograd.backward(\n\u001b[0m\u001b[0;32m    489\u001b[0m             \u001b[0mself\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mgradient\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mretain_graph\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mcreate_graph\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0minputs\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0minputs\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    490\u001b[0m         )\n",
      "\u001b[1;32me:\\Anaconda\\lib\\site-packages\\torch\\autograd\\__init__.py\u001b[0m in \u001b[0;36mbackward\u001b[1;34m(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)\u001b[0m\n\u001b[0;32m    195\u001b[0m     \u001b[1;31m# some Python versions print out the first line of a multi-line function\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    196\u001b[0m     \u001b[1;31m# calls in the traceback and some print out the last line\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 197\u001b[1;33m     Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass\n\u001b[0m\u001b[0;32m    198\u001b[0m         \u001b[0mtensors\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mgrad_tensors_\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mretain_graph\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mcreate_graph\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0minputs\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    199\u001b[0m         allow_unreachable=True, accumulate_grad=True)  # Calls into the C++ engine to run the backward pass\n",
      "\u001b[1;31mOutOfMemoryError\u001b[0m: CUDA out of memory. Tried to allocate 144.00 MiB (GPU 0; 2.00 GiB total capacity; 1.56 GiB already allocated; 0 bytes free; 1.59 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF"
     ]
    }
   ],
   "source": [
    "# data transforms\n",
    "\n",
    "# compute the per-channel image means and stds (used in Normalize below)\n",
    "# transform = transforms.Compose(\n",
    "#     [transforms.Resize((224,224)),\n",
    "#      transforms.ToTensor()]\n",
    "# )\n",
    "# train_data = torchvision.datasets.ImageFolder('../data/flower_data/train/', transform=transform)\n",
    "# test_data = torchvision.datasets.ImageFolder('../data/flower_data/test/', transform= transform)\n",
    "# train_data_info = getMeanAndBias(train_data)\n",
    "# test_data_info = getMeanAndBias(test_data)\n",
    "\n",
    "data_transform = {\n",
    "    \"train\":transforms.Compose(\n",
    "            [transforms.Resize((224,224)), # resize images to 224x224\n",
    "            transforms.ToTensor(),\n",
    "            transforms.Normalize((0.4669, 0.4252, 0.3044), (0.2449, 0.2186, 0.2238))]),\n",
    "    \"test\": transforms.Compose(\n",
    "        [transforms.Resize((224,224)),\n",
    "         transforms.ToTensor(),\n",
    "         transforms.Normalize((0.4637, 0.4248, 0.3011), (0.2502, 0.2219, 0.2292))]\n",
    "    )}\n",
    "train_data_path = '../data/flower_data/train/'\n",
    "test_data_path = '../data/flower_data/test/'\n",
    "train_data = torchvision.datasets.ImageFolder(\n",
    "    root= train_data_path, transform= data_transform['train']\n",
    ")\n",
    "test_data = torchvision.datasets.ImageFolder(\n",
    "    root= test_data_path, transform= data_transform['test']\n",
    ")\n",
    "train_dataloader = DataLoader(\n",
    "    dataset= train_data,\n",
    "    batch_size= 64,\n",
    "    shuffle= True) # reshuffle the training data every epoch\n",
    "test_dataloader = DataLoader(\n",
    "    dataset= test_data,\n",
    "    batch_size= 64\n",
    ")\n",
    "\n",
    "# define the loss function and optimizer\n",
    "loss_fn = nn.CrossEntropyLoss()\n",
    "optimizer = optim.Adam(model.parameters(), lr=1e-3)\n",
    "\n",
    "def test_acc(data, model):\n",
    "    \"\"\"\n",
    "    data: a DataLoader over batched test data\n",
    "    Returns the accuracy on the test set\n",
    "    \"\"\"\n",
    "    model.eval()\n",
    "    correct = 0.0\n",
    "    with torch.no_grad():\n",
    "        for x,y in data:\n",
    "            x, y = x.to(device), y.to(device)\n",
    "            pred = model(x)\n",
    "            correct += (pred.argmax(1)== y).type(torch.float).sum().item()\n",
    "    return correct/ len(data.dataset)\n",
    "\n",
    "epochs = 21\n",
    "# the pipeline is deliberately simple; data augmentation, regularization, extra dropout, or residual (ResNet-style) connections could improve results\n",
    "for epoch in range(epochs):\n",
    "    print('Epoch:{}\\n'.format(epoch+1))\n",
    "    for batch, (x,y) in enumerate(train_dataloader):\n",
    "        size = len(train_dataloader.dataset)\n",
    "        model.train()\n",
    "        x, y = x.to(device), y.to(device)\n",
    "\n",
    "        pred = model(x)\n",
    "        loss = loss_fn(pred, y)\n",
    "\n",
    "        optimizer.zero_grad()\n",
    "        loss.backward()\n",
    "        optimizer.step()\n",
    "    \n",
    "        if batch %100 == 0:\n",
    "            loss, current = loss.item(), (batch+ 1)* len(x) \n",
    "            print(f\"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]\")\n",
    "    correct = test_acc(test_dataloader, model)\n",
    "    print(f\"Test Error: \\n Accuracy: {(100*correct):>0.1f}% \\n\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "base",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
